Abstract: In this chapter, we study the well-known and challenging Tweety Penguin Triangle Problem (TPTP or TP2) pointed out by Judea Pearl in one of his books. We first present the solution of the TP2 based on fallacious Bayesian reasoning and prove that this reasoning cannot be used to conclude whether the penguin-bird Tweety can fly. Then we present in detail the counter-intuitive solution obtained from the Dempster-Shafer Theory (DST). Finally, we show how the solution can be obtained with our new theory of plausible and paradoxical reasoning (DSmT).


12.1 Introduction 


Judea Pearl claimed that the DST of evidence fails to provide a reasonable solution for the combination of evidence, even for apparently very simple fusion problems [11] [12]. Most criticisms are answered by Philippe Smets in [22] [23]. The Tweety Penguin Triangle Problem (TP2) is a typical, exciting and challenging problem for all theories managing uncertainty and conflict, because it shows how difficult it is for automatic reasoning systems to maintain truth when the classical property of transitivity (which is basic to material implication) does not hold. In his book, Judea Pearl presents and discusses in detail the semantic clash between Bayesian and Dempster-Shafer reasoning. We present here our analysis of this problem and provide a new solution of the Tweety Penguin Triangle Problem based on our new theory of plausible and paradoxical reasoning, known as DSmT (Dezert-Smarandache Theory). We show how this problem can be attacked and solved by our new reasoning with the help of the (hybrid) DSm rule
of combination (see chapter Æ). The purpose of this chapter is not to browse all approaches available in 
literature for attacking the TP2 problem, but only to compare DSm reasoning with Bayesian reasoning and with the plausible reasoning of the DST framework. Interesting but complex analyses of this problem based on default reasoning and ε-belief functions can also be found, for example, in [22] and [I]. Other interesting and promising approaches to the TP2 problem, based on the fuzzy logic of Zadeh jointly with the theory of possibilities, are under investigation. Some theoretical research on new conditional event algebras (CEA) has emerged in the literature [7] in recent years and could offer a new track for attacking the TP2 problem, although unfortunately no clear, didactic, simple and convincing examples have been provided to show the real efficiency and usefulness of these theoretical investigations.


12.2 The Tweety Penguin Triangle Problem 


This important and challenging problem, known in the literature as the Tweety Penguin Triangle Problem (TP2), is presented in detail by Judea Pearl in [11]. We briefly present here the TP2 and the solutions based first on fallacious Bayesian reasoning and then on Dempster-Shafer reasoning. We then focus our analysis of this problem on the DSmT framework and DSm reasoning.


Let's consider the set $R = \{r_1, r_2, r_3\}$ of given rules (known as defaults in [11]):
• $r_1$: "Penguins normally don't fly" $\Leftrightarrow (p \rightarrow \neg f)$
• $r_2$: "Birds normally fly" $\Leftrightarrow (b \rightarrow f)$
• $r_3$: "Penguins are birds" $\Leftrightarrow (p \rightarrow b)$


To emphasize our strong conviction in these rules, we assign them high confidence weights $w_1$, $w_2$ and $w_3$ in $[0, 1]$ with $w_1 = 1 - \epsilon_1$, $w_2 = 1 - \epsilon_2$ and $w_3 = 1$ (where $\epsilon_1$ and $\epsilon_2$ are small positive quantities). The conviction in these rules is represented by the set $W = \{w_1, w_2, w_3\}$ in the sequel.


Another useful and general notation adopted by Judea Pearl in the first pages of his book [11] to characterize these three weighted rules is the following (where $w_1, w_2, w_3 \in [0, 1]$):

$$r_1: p \xrightarrow{w_1} (\neg f) \qquad r_2: b \xrightarrow{w_2} f \qquad r_3: p \xrightarrow{w_3} b$$

When $w_1, w_2, w_3 \in \{0, 1\}$, classical logic is the perfect tool to conclude on the truth or falsity of a proposition built from these rules, based on the standard propositional calculus and mainly on its three fundamental rules (Modus Ponens, Modus Tollens and Modus Barbara, i.e. the transitivity rule). When $0 < w_1, w_2, w_3 < 1$, classical logic cannot be applied because Modus Ponens, Modus Tollens and Modus Barbara no longer hold, and other tools must be chosen. This will be discussed in detail in section 12.3.2.


Question: Assume we observe an animal called Tweety ($T$) that is categorically classified as a bird ($b$) and a penguin ($p$), i.e. our observation is $O \triangleq [T = (b \cap p)] \equiv [(T = b) \cap (T = p)]$. The notation $T = (b \cap p)$ stands here for "Entity $T$ holds property $(b \cap p)$". What is the belief (or the probability, if such a probability exists) that Tweety can fly, given the observation $O$ and all the information available in our knowledge base (i.e. our rule-based system $R$ and $W$)?


The difficulty of this problem for most artificial reasoning systems (ARS) comes from the fact that, in this example, the property of transitivity, usually assumed to be satisfied under the material-implication interpretation [11], $(p \rightarrow b, b \rightarrow f) \Rightarrow (p \rightarrow f)$, does not hold (see section 12.3.2). In this interesting example, the classical property of inheritance is thus broken. Nevertheless, a powerful artificial reasoning system must be able to deal with this kind of difficult problem and must provide a reliable conclusion through a general mechanism of reasoning, whatever the values of the convictions are (not only values close to 0 or 1). We now examine three ARS: one based on Bayesian reasoning [11], which turns out to be fallacious and not appropriate for this problem (we explain why), one based on the Dempster-Shafer Theory (DST) and one based on the Dezert-Smarandache Theory (DSmT) (see part I of this book).


12.3 The fallacious Bayesian reasoning 


We first present the fallacious Bayesian reasoning solution drawn from J. Pearl's book (pages 447-449), and then explain why this solution, which at first glance agrees with intuition, is in fact fallacious. We then explain why the common rational intuition actually turns out to be wrong and show the weakness of Pearl's analysis.


12.3.1 The Pearl’s analysis 


To preserve mathematical rigor, we explicitly introduce all available information in the derivations. In other words, one wants to evaluate, using Bayesian reasoning, the conditional probability (if it exists) $P(T = f|O, R, W) = P(T = f|T = p, T = b, R, W)$. Pearl's analysis is based on the assumption that a conviction in a given rule can be interpreted as a conditional probability (see [11] page 4). In other words, if one has a given rule $a \xrightarrow{w} b$ with $w \in [0, 1]$, then one can interpret, at least for the calculus, $w$ as $P(b|a)$, and thus probability theory and Bayesian reasoning can help to answer the question. We prove in the following section that such a model cannot reasonably be adopted. For now, we just assume, as Judea Pearl does, that this probabilistic model holds. Based on this assumption, since the conditioning information $(T = p, T = b, R, W)$ is strictly equivalent to $(T = p, R, W)$ because rule $r_3$ is known with certainty (since $w_3 = 1$), one easily gets Pearl's fallacious, intuitively expected result:





$$\begin{aligned}
P(T = f|O, R, W) &= P(T = f|T = p, T = b, R, W) \\
&= P(T = f|T = p, R, W) \\
&= 1 - P(T = \bar{f}|T = p, R, W) \\
&= 1 - w_1 = \epsilon_1
\end{aligned}$$


From this simple analysis, Tweety's "birdness" does not render her a better flyer than an ordinary penguin, as intuitively expected, and the probability that Tweety can fly remains very low, which looks normal. We reemphasize here that, in his Bayesian reasoning, J. Pearl assumes that the weight $w_1$ for the conviction in rule $r_1$ can be interpreted as a real probability measure $P(\bar{f}|p)$. This assumption is necessary to provide the rigorous derivation of $P(T = f|O, R, W)$. It turns out, however, that convictions $w_i$ on logical rules cannot be interpreted in terms of probabilities, as we will prove in the next section.


When rule $r_3$ is no longer asserted with absolute certainty (i.e. with $w_3 = 1$) but is subject to exceptions, i.e. $w_3 = 1 - \epsilon_3 < 1$, the fallacious Bayesian reasoning yields (where the notations $T = f$, $T = b$ and $T = p$ are replaced by $f$, $b$ and $p$ due to space limitations):


$$\begin{aligned}
P(f|O, R, W) &= P(f|p, b, R, W) \\
&= \frac{P(f, p, b|R, W)}{P(p, b|R, W)} \\
&= \frac{P(f, b|p, R, W)\, P(p|R, W)}{P(b|p, R, W)\, P(p|R, W)}
\end{aligned}$$


By assuming P(p|R,W) > 0, one gets after simplification by P(p|R, W) 


$$\begin{aligned}
P(f|O, R, W) &= \frac{P(f, b|p, R, W)}{P(b|p, R, W)} \\
&= \frac{P(b|f, p, R, W)\, P(f|p, R, W)}{P(b|p, R, W)}
\end{aligned}$$


If one assumes $P(b|p, R, W) = w_3 = 1 - \epsilon_3$ and $P(f|p, R, W) = 1 - P(\bar{f}|p, R, W) = 1 - w_1 = \epsilon_1$, one gets

$$P(f|O, R, W) = P(b|f, p, R, W) \times \frac{\epsilon_1}{1 - \epsilon_3}$$

Because $0 \le P(b|f, p, R, W) \le 1$, one finally gets Pearl's result [11] (p.448)

$$P(f|O, R, W) \le \frac{\epsilon_1}{1 - \epsilon_3} \tag{12.1}$$

which states that the observed animal Tweety (a penguin-bird) has a very small probability of flying as long as $\epsilon_3$ remains small, regardless of how many birds cannot fly ($\epsilon_2$), and consequently has a high probability of not flying because $P(f|O, R, W) + P(\bar{f}|O, R, W) = 1$, since the events $f$ and $\bar{f}$ are mutually exclusive and exhaustive (assuming that Pearl's probabilistic model holds).


12.3.2 The weakness of the Pearl’s analysis 


We now prove that the previous Bayesian reasoning is really fallacious and that, on deeper analysis, the question of whether Tweety can fly is truly undecidable. Actually,
the Bayes’ inference is not a classical inference (see chapter B] for justification). Indeed, before applying 
Bayesian reasoning blindly as in the previous section, one first has to check that the probabilistic model is well-founded for characterizing the convictions of the rules of the rule-based system under analysis. We prove here that such a probabilistic model does not hold for a suitable and useful representation of the problem, and consequently for any problem based on the weighting of logical rules (with positive weighting factors/convictions below 1).


12.3.2.1 Preliminaries 


We recall here only a few important principles of the propositional calculus of classical Mathematical Logic, which will be used in our demonstration. A simple notation, which may appear unusual to logicians, is adopted here just for convenience. A detailed presentation of the propositional calculus
and Mathematical Logic can be easily found in many standard mathematical textbooks like [15] [10] D]. 


Here are these important principles: 


• Excluded middle principle (third excluded principle): a logical variable is either true or false, i.e.

$$a \vee \neg a \tag{12.2}$$

• Non-contradiction law: a logical variable cannot be both true and false, i.e.

$$\neg(a \wedge \neg a) \tag{12.3}$$

• Modus Ponens: this rule of the propositional calculus states that if a logical variable $a$ is true and $a \rightarrow b$ is true, then $b$ is true (syllogism principle), i.e.

$$(a \wedge (a \rightarrow b)) \rightarrow b \tag{12.4}$$

• Modus Tollens: this rule of the propositional calculus states that if a logical variable $\neg b$ is true and $a \rightarrow b$ is true, then $\neg a$ is true, i.e.

$$(\neg b \wedge (a \rightarrow b)) \rightarrow \neg a \tag{12.5}$$

• Modus Barbara: this rule of the propositional calculus states that if $a \rightarrow b$ is true and $b \rightarrow c$ is true, then $a \rightarrow c$ is true (transitivity property), i.e.

$$((a \rightarrow b) \wedge (b \rightarrow c)) \rightarrow (a \rightarrow c) \tag{12.6}$$

From these principles, one can easily prove, using the truth table method, the following property (more general deducibility theorems in Mathematical Logic can be found in [18] [19]):

$$((a \rightarrow b) \wedge (c \rightarrow d)) \rightarrow ((a \wedge c) \rightarrow (b \wedge d)) \tag{12.7}$$
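
The truth table method mentioned above can be sketched in a few lines of Python. This is only an illustrative check under the usual material-implication semantics; the helper names `implies` and `is_tautology` are our own and are not part of the original text. The sketch enumerates every truth assignment and confirms that (12.4) to (12.7) are tautologies, whereas the converse rule $(a \rightarrow b) \rightarrow (b \rightarrow a)$ is not.

```python
from itertools import product


def implies(x: bool, y: bool) -> bool:
    """Material implication x -> y."""
    return (not x) or y


def is_tautology(formula, n_vars: int) -> bool:
    """True if `formula` holds under every truth assignment of its variables."""
    return all(formula(*values) for values in product((False, True), repeat=n_vars))


def modus_ponens(a, b):  # formula (12.4)
    return implies(a and implies(a, b), b)


def modus_tollens(a, b):  # formula (12.5)
    return implies((not b) and implies(a, b), not a)


def modus_barbara(a, b, c):  # formula (12.6)
    return implies(implies(a, b) and implies(b, c), implies(a, c))


def property_12_7(a, b, c, d):  # formula (12.7)
    return implies(implies(a, b) and implies(c, d), implies(a and c, b and d))


def converse(a, b):  # (a -> b) -> (b -> a), included as a non-valid contrast
    return implies(implies(a, b), implies(b, a))


CHECKS = {
    "modus ponens (12.4)": (modus_ponens, 2),
    "modus tollens (12.5)": (modus_tollens, 2),
    "modus barbara (12.6)": (modus_barbara, 3),
    "property (12.7)": (property_12_7, 4),
    "converse (not a rule)": (converse, 2),
}

for name, (formula, n_vars) in CHECKS.items():
    print(f"{name}: tautology = {is_tautology(formula, n_vars)}")
```

The same enumeration applies to any other propositional formula over a handful of variables, since the number of assignments is only $2^n$.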


12.3.2.2 Analysis of the problem when $\epsilon_1 = \epsilon_2 = \epsilon_3 = 0$


We first examine the TP2 when one has no doubt about the rules of our rule-based system, i.e.

$$\begin{aligned}
r_1 &: p \xrightarrow{w_1 = 1 - \epsilon_1 = 1} (\neg f) \\
r_2 &: b \xrightarrow{w_2 = 1 - \epsilon_2 = 1} f \\
r_3 &: p \xrightarrow{w_3 = 1 - \epsilon_3 = 1} b
\end{aligned}$$

From rules $r_1$ and $r_2$, and because of property (12.7), one concludes that

$$p \wedge b \rightarrow (f \wedge \neg f)$$

and using the non-contradiction law (12.3) with the Modus Tollens (12.5), one finally gets

$$\neg(f \wedge \neg f) \rightarrow \neg(p \wedge b)$$

which proves that $p \wedge b$ is always false, whatever the rule $r_3$ is. Interpreted in terms of probability theory, the event $T = p \cap b$ corresponds to the impossible event $\emptyset$, since $T = f$ and $T = \bar{f}$ are exclusive and exhaustive events. Under such conditions, the analysis proves the non-existence of the penguin-bird Tweety.


If one adopts the notations¹ of probability theory, trying to derive $P(T = f|T = p \cap b)$ and $P(T = \bar{f}|T = p \cap b)$ with Bayesian reasoning is just impossible, because from one of the axioms of probability theory one must have $P(\emptyset) = 0$, and from the conditioning rule one would get for this problem the indeterminate expressions

$$P(T = f|T = p \cap b) = P(T = f|T = \emptyset) = \frac{P(T = f \cap \emptyset)}{P(T = \emptyset)} = \frac{P(T = \emptyset)}{P(T = \emptyset)} = \frac{0}{0} \quad \text{(indeterminate)}$$

and similarly

$$P(T = \bar{f}|T = p \cap b) = P(T = \bar{f}|T = \emptyset) = \frac{P(T = \bar{f} \cap \emptyset)}{P(T = \emptyset)} = \frac{P(T = \emptyset)}{P(T = \emptyset)} = \frac{0}{0} \quad \text{(indeterminate)}$$

¹ Because probabilities are related to sets, we use here the common set-complement notation $\bar{f}$ instead of the logical negation notation $\neg f$, and $\cap$ for $\wedge$ and $\cup$ for $\vee$ if necessary.


12.3.2.3 Analysis of the problem when $0 < \epsilon_1, \epsilon_2, \epsilon_3 < 1$


Let's now examine the general case where one allows a little doubt about the rules, characterized by taking $\epsilon_1 \gtrsim 0$, $\epsilon_2 \gtrsim 0$ and $\epsilon_3 \gtrsim 0$, and examine the consequences for the probabilistic model of these rules.

First note that, because of the excluded middle principle and the assumed existence of a probabilistic model for a weighted rule, one should be able to consider simultaneously both "probabilistic/Bayesian" rules

$$a \xrightarrow{P(b|a) = w} b \qquad a \xrightarrow{P(\bar{b}|a) = 1 - w} \bar{b} \tag{12.8}$$


In terms of classical (objective) probability theory, these weighted rules just indicate that in $100 \times w$ percent of cases the logical variable $b$ is true if $a$ is true or, equivalently, that in $100 \times w$ percent of cases the random event $b$ occurs when the random event $a$ occurs. When we don't refer to classical probability theory, the weighting factors $w$ and $1 - w$ just indicate the level of conviction committed to the validity of the rules. Although very appealing at first glance, this probabilistic model actually hides a strong weakness, especially when dealing with several rules, as shown below.

Let's first prove that from a "probabilized" rule $a \xrightarrow{P(b|a) = w} b$ one cannot rigorously assess the convictions in its Modus Tollens. In other words, from (12.8), what can we conclude about

$$\bar{b} \xrightarrow{P(\bar{a}|\bar{b}) = ?} \bar{a} \qquad b \xrightarrow{P(\bar{a}|b) = ?} \bar{a} \tag{12.9}$$
From Bayes' rule of conditioning (which must hold if the probabilistic model holds), one can express $P(\bar{a}|\bar{b})$ and $P(\bar{a}|b)$ as follows

$$P(\bar{a}|\bar{b}) = 1 - P(a|\bar{b}) = 1 - \frac{P(a \cap \bar{b})}{P(\bar{b})} = 1 - \frac{P(\bar{b}|a) P(a)}{P(\bar{b})}$$

$$P(\bar{a}|b) = 1 - P(a|b) = 1 - \frac{P(a \cap b)}{P(b)} = 1 - \frac{P(b|a) P(a)}{P(b)}$$

or equivalently, by replacing $P(b|a)$ and $P(\bar{b}|a)$ by their values $w$ and $1 - w$, one gets

$$P(\bar{a}|\bar{b}) = 1 - (1 - w) \frac{P(a)}{P(\bar{b})} \qquad P(\bar{a}|b) = 1 - w \frac{P(a)}{P(b)} \tag{12.10}$$

These relationships show that one cannot fully derive $P(\bar{a}|\bar{b})$ and $P(\bar{a}|b)$ in theory, because the prior probabilities $P(a)$ and $P(b)$ are unknown.


A simplistic solution, based on the principle of indifference, is then just to assume, without solid justification, that $P(a) = P(\bar{a}) = 1/2$ and $P(b) = P(\bar{b}) = 1/2$. With such an assumption, one gets the estimates $\hat{P}(\bar{a}|\bar{b}) = w$ and $\hat{P}(\bar{a}|b) = 1 - w$ for $P(\bar{a}|\bar{b})$ and $P(\bar{a}|b)$ respectively, and we can go further in the derivations.
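
To make the dependence on the unknown priors concrete, here is a minimal numerical sketch of (12.10). The function name `contrapositive_probabilities` and the sample values of $w$, $P(a)$ and $P(b)$ are illustrative assumptions, not values taken from the text. The function also rejects priors that are incompatible with $P(b|a) = w$, since $P(a \cap b) = w P(a)$ cannot exceed $P(b)$.

```python
def contrapositive_probabilities(w: float, p_a: float, p_b: float):
    """Evaluate (12.10): return (P(not a | not b), P(not a | b)).

    w   -- conviction of the rule a -> b, read as P(b | a)
    p_a -- prior P(a), an assumed input that the rule itself does not provide
    p_b -- prior P(b), an assumed input that the rule itself does not provide
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("priors must lie strictly between 0 and 1")
    p_not_b = 1.0 - p_b
    # P(a and b) = w * P(a) must not exceed P(b); likewise for not-b.
    if w * p_a > p_b or (1.0 - w) * p_a > p_not_b:
        raise ValueError("priors are inconsistent with P(b | a) = w")
    p_na_given_nb = 1.0 - (1.0 - w) * p_a / p_not_b
    p_na_given_b = 1.0 - w * p_a / p_b
    return p_na_given_nb, p_na_given_b


w = 0.99
# Principle of indifference: P(a) = P(b) = 1/2 gives back (w, 1 - w).
indifferent = contrapositive_probabilities(w, 0.5, 0.5)
# A different (equally unjustified) prior changes both values.
skewed = contrapositive_probabilities(w, 0.1, 0.5)

print("indifference:", indifferent)
print("P(a) = 0.1  :", skewed)
```

The two calls use the same rule weight but return different Modus Tollens convictions, which is exactly why the weight $w$ alone does not determine them.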


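Now let's go back to our Tweety Penguin Triangle Problem. Based on the probabilistic model (assumed to hold), one now starts with both

$$\begin{aligned}
r_1 &: p \xrightarrow{P(\bar{f}|p) = 1 - \epsilon_1} \bar{f} & p &\xrightarrow{P(f|p) = \epsilon_1} f \\
r_2 &: b \xrightarrow{P(f|b) = 1 - \epsilon_2} f & b &\xrightarrow{P(\bar{f}|b) = \epsilon_2} \bar{f} \\
r_3 &: p \xrightarrow{P(b|p) = 1 - \epsilon_3} b & p &\xrightarrow{P(\bar{b}|p) = \epsilon_3} \bar{b}
\end{aligned} \tag{12.11}$$

Note that, taking into account our preliminary analysis and accepting the principle of indifference, one also has the two sets of weighted rules

$$\begin{aligned}
f &\xrightarrow{P(\bar{p}|f) = 1 - \epsilon_1} \bar{p} & \bar{f} &\xrightarrow{P(\bar{p}|\bar{f}) = \epsilon_1} \bar{p} \\
\bar{f} &\xrightarrow{P(\bar{b}|\bar{f}) = 1 - \epsilon_2} \bar{b} & f &\xrightarrow{P(\bar{b}|f) = \epsilon_2} \bar{b} \\
\bar{b} &\xrightarrow{P(\bar{p}|\bar{b}) = 1 - \epsilon_3} \bar{p} & b &\xrightarrow{P(\bar{p}|b) = \epsilon_3} \bar{p}
\end{aligned} \tag{12.12}$$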


One wants to assess the convictions (assumed to correspond to some conditional probabilities) in the following rules

$$p \cap b \xrightarrow{P(f|p \cap b) = ?} f \tag{12.13}$$

$$p \cap b \xrightarrow{P(\bar{f}|p \cap b) = ?} \bar{f} \tag{12.14}$$
The question is to derive rigorously $P(f|p \cap b)$ and $P(\bar{f}|p \cap b)$ from all previously available information. It turns out that the derivation is impossible without an unjustified extra assumption of conditional independence. Indeed, $P(f|p \cap b)$ and $P(\bar{f}|p \cap b)$ are given by

$$\begin{aligned}
P(f|p \cap b) &= \frac{P(f, p, b)}{P(p, b)} = \frac{P(p, b|f) P(f)}{P(b|p) P(p)} \\
P(\bar{f}|p \cap b) &= \frac{P(\bar{f}, p, b)}{P(p, b)} = \frac{P(p, b|\bar{f}) P(\bar{f})}{P(b|p) P(p)}
\end{aligned} \tag{12.15}$$

If one assumes, as J. Pearl does, that the conditional independence condition also holds, i.e. $P(p, b|f) = P(p|f) P(b|f)$ and $P(p, b|\bar{f}) = P(p|\bar{f}) P(b|\bar{f})$, then one gets

$$\begin{aligned}
P(f|p \cap b) &= \frac{P(p|f) P(b|f) P(f)}{P(b|p) P(p)} \\
P(\bar{f}|p \cap b) &= \frac{P(p|\bar{f}) P(b|\bar{f}) P(\bar{f})}{P(b|p) P(p)}
\end{aligned}$$
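By again accepting the principle of indifference, $P(f) = P(\bar{f}) = 1/2$ and $P(p) = P(\bar{p}) = 1/2$, one gets the following expressions

$$\begin{aligned}
\hat{P}(f|p \cap b) &= \frac{P(p|f) P(b|f)}{P(b|p)} \\
\hat{P}(\bar{f}|p \cap b) &= \frac{P(p|\bar{f}) P(b|\bar{f})}{P(b|p)}
\end{aligned} \tag{12.16}$$

Replacing the probabilities $P(p|f)$, $P(b|f)$, $P(b|p)$, $P(p|\bar{f})$ and $P(b|\bar{f})$ by their values in formula (12.16), one finally gets

$$\begin{aligned}
\hat{P}(f|p \cap b) &= \frac{\epsilon_1 (1 - \epsilon_2)}{1 - \epsilon_3} \\
\hat{P}(\bar{f}|p \cap b) &= \frac{\epsilon_2 (1 - \epsilon_1)}{1 - \epsilon_3}
\end{aligned} \tag{12.17}$$
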

Therefore we see that, even if one accepts the principle of indifference together with the conditional independence assumption, the approximated "probabilities" both remain small and do not correspond to a real probability measure, since the conditional probabilities of the exclusive elements $f$ and $\bar{f}$ do not add up to one. When $\epsilon_1$, $\epsilon_2$ and $\epsilon_3$ tend towards 0, one has

$$\hat{P}(f|p \cap b) + \hat{P}(\bar{f}|p \cap b) \approx 0$$
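
A quick numerical evaluation of (12.17) illustrates the point. The sketch below is only a sanity check; the function name `approx_probabilities` and the sample values of $\epsilon_1$, $\epsilon_2$, $\epsilon_3$ are illustrative assumptions. For each setting it prints the two approximated "probabilities" and their sum, which stays far below one.

```python
def approx_probabilities(eps1: float, eps2: float, eps3: float):
    """Evaluate (12.17): approximated P(f | p and b) and P(not f | p and b)."""
    p_fly = eps1 * (1.0 - eps2) / (1.0 - eps3)
    p_not_fly = eps2 * (1.0 - eps1) / (1.0 - eps3)
    return p_fly, p_not_fly


# Illustrative values only: all three doubts shrink together.
for eps in (0.1, 0.01, 0.001):
    p_fly, p_not_fly = approx_probabilities(eps, eps, eps)
    print(f"eps = {eps}: P^(f) = {p_fly:.6f}, "
          f"P^(not f) = {p_not_fly:.6f}, sum = {p_fly + p_not_fly:.6f}")
```

Since $f$ and $\bar{f}$ are exclusive and exhaustive, a genuine conditional probability measure would give a sum of exactly one for every setting.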


Actually, our analysis based on the principle of indifference, the conditional independence assumption and the model proposed by Judea Pearl clearly proves that Bayesian reasoning cannot be rigorously applied to this kind of weighted rule-based system, because no probabilistic model exists that correctly describes the problem. This conclusion is actually not surprising in view of Lewis' theorem [13], explained in detail in [7] (chapter 11).


274 CHAPTER 12. ON THE TWEETY PENGUIN TRIANGLE PROBLEM 


Let's now explain the source of the error in the fallacious reasoning, which looked coherent with common intuition. The problem arises directly from the fact that the penguin class and the bird class are defined in this problem only with respect to the "flying" and "not-flying" properties. If one considers only these properties, then no animal Tweety can be categorically classified as a penguin-bird, because penguin-birdness does not hold in reality on the basis of these exclusive and exhaustive properties (if we consider only the information given within the rules $r_1$, $r_2$ and $r_3$). Actually, everybody knows that penguins are classified as birds because the "birdness" property is not defined with respect to the "flying" or "not-flying" abilities of the animal but by other zoological characteristics $C$ (birds are warm-blooded, oviparous vertebrates with a beak, feathers, and forelimbs that are wings), and such information must be properly taken into account in rule-based systems to avoid falling into the trap of such fallacious reasoning. The intuition (which seems to justify the conclusion of the fallacious reasoning) for the TP2 is actually biased, because one already knows that penguins (which are truly classified as birds by other criteria) do not fly in the real world, and thus we commit a low conviction (which is definitely not a probability measure, but rather a belief) to the fact that a penguin-bird can fly. Thus Pearl's analysis proposed in [11] appears to the authors to be unfortunately incomplete and somewhat fallacious.


12.4 The Dempster-Shafer reasoning 


As pointed out by Judea Pearl in [11], Dempster-Shafer reasoning yields, for this problem, a very counter-intuitive result: birdness seems to endow Tweety with extra flying power! We present here our analysis of this problem based on Dempster-Shafer reasoning.


Let's examine in detail the available prior information summarized by the rule $r_1$: "Penguins normally don't fly" $\Leftrightarrow (p \rightarrow \neg f)$ with the conviction $w_1 = 1 - \epsilon_1$, where $\epsilon_1$ is a small positive number close to zero. This information, in the DST framework, has to be correctly represented in terms of a conditional belief $\text{Bel}_1(\bar{f}|p) = 1 - \epsilon_1$ rather than directly by the mass $m_1(\bar{f} \cap p) = 1 - \epsilon_1$.

Choosing $\text{Bel}_1(\bar{f}|p) = 1 - \epsilon_1$ means that there is a high degree of belief that a penguin-animal is also a nonflying-animal (whatever kind of animal we are observing). This representation reflects our prior knowledge perfectly, while the erroneous coarse modeling based on the commitment $m_1(\bar{f} \cap p) = 1 - \epsilon_1$ is unable to distinguish between rule $r_1$ and another (possibly erroneous) rule such as $r_1' : (\neg f \rightarrow p)$ having the same conviction value $w_1$. The correct model allows us to distinguish between $r_1$ and $r_1'$ (even if they have the same numerical level of conviction) by considering the two different conditional beliefs $\text{Bel}_1(\bar{f}|p) = 1 - \epsilon_1$ and $\text{Bel}_{1'}(p|\bar{f}) = 1 - \epsilon_1$. The coarse/inadequate basic belief assignment modeling (if adopted) would, on the contrary, make no distinction between the two rules $r_1$ and $r_1'$, since one would have to take $m_1(\bar{f} \cap p) = m_{1'}(p \cap \bar{f})$, and it therefore cannot serve as the starting model for the analysis.

Similarly, the prior information relative to rules $r_2 : (b \rightarrow f)$ and $r_3 : (p \rightarrow b)$ with convictions $w_2 = 1 - \epsilon_2$ and $w_3 = 1 - \epsilon_3$ has to be modeled by the conditional beliefs $\text{Bel}_2(f|b) = 1 - \epsilon_2$ and $\text{Bel}_3(b|p) = 1 - \epsilon_3$ respectively.


The first problem we now have to face is the combination of these three pieces of prior information, characterized by $\text{Bel}_1(\bar{f}|p) = 1 - \epsilon_1$, $\text{Bel}_2(f|b) = 1 - \epsilon_2$ and $\text{Bel}_3(b|p) = 1 - \epsilon_3$. All the available prior information can actually be viewed as three independent bodies of evidence $\mathcal{B}_1$, $\mathcal{B}_2$ and $\mathcal{B}_3$ separately providing the partial knowledge summarized by the values of $\text{Bel}_1(\bar{f}|p)$, $\text{Bel}_2(f|b)$ and $\text{Bel}_3(b|p)$. To achieve the combination, one needs to define complete basic belief assignments $m_1(.)$, $m_2(.)$ and $m_3(.)$ compatible with the partial conditional beliefs $\text{Bel}_1(\bar{f}|p) = 1 - \epsilon_1$, $\text{Bel}_2(f|b) = 1 - \epsilon_2$ and $\text{Bel}_3(b|p) = 1 - \epsilon_3$ without introducing extra knowledge: we don't want to introduce into the derivations extra information that we don't have in reality. We present in detail the justification for the choice of the assignment $m_1(.)$. The choice of $m_2(.)$ and $m_3(.)$ follows similarly.

The body of evidence $\mathcal{B}_1$ provides some information only about $f$ and $p$, through the value of $\text{Bel}_1(\bar{f}|p)$, and without reference to $b$. Therefore the frame of discernment $\Theta_1$ induced by $\mathcal{B}_1$ and satisfying Shafer's model (i.e. a finite set of exhaustive and exclusive elements) corresponds to

$$\Theta_1 = \{\theta_1 \triangleq \bar{f} \cap \bar{p},\ \theta_2 \triangleq f \cap \bar{p},\ \theta_3 \triangleq \bar{f} \cap p,\ \theta_4 \triangleq f \cap p\}$$


schematically represented by 


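[Figure: partition of $\Theta_1$ into the four cells $\theta_1, \ldots, \theta_4$, with $p = \theta_3 \cup \theta_4$, $\bar{p} = \theta_1 \cup \theta_2$, $f = \theta_2 \cup \theta_4$ and $\bar{f} = \theta_1 \cup \theta_3$.]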


The complete basic assignment $m_1(.)$ we are searching for, defined over the power set $2^{\Theta_1}$ and compatible with $\text{Bel}_1(\bar{f}|p)$, is actually the result of Dempster's combination of an (as yet) unknown basic belief assignment $m_1'(.)$ with the particular assignment $m_1''(.)$ defined by $m_1''(p \triangleq \theta_3 \cup \theta_4) = 1$; in other words, one has

$$m_1(.) = [m_1' \oplus m_1''](.)$$

From now on, we explicitly introduce the conditioning term in our notation to avoid confusion, and thus we use $m_1(.|p) = m_1(.|\theta_3 \cup \theta_4)$ instead of $m_1(.)$. From $m_1''(p \triangleq \theta_3 \cup \theta_4) = 1$ and from any generic unknown basic assignment $m_1'(.)$ defined by its components $m_1'(\emptyset) \triangleq 0$, $m_1'(\theta_1)$, $m_1'(\theta_2)$, $m_1'(\theta_3)$, $m_1'(\theta_4)$, $m_1'(\theta_1 \cup \theta_2)$, $m_1'(\theta_1 \cup \theta_3)$, $m_1'(\theta_1 \cup \theta_4)$, $m_1'(\theta_2 \cup \theta_3)$, $m_1'(\theta_2 \cup \theta_4)$, $m_1'(\theta_3 \cup \theta_4)$, $m_1'(\theta_1 \cup \theta_2 \cup \theta_3)$, $m_1'(\theta_1 \cup \theta_2 \cup \theta_4)$, $m_1'(\theta_1 \cup \theta_3 \cup \theta_4)$, $m_1'(\theta_2 \cup \theta_3 \cup \theta_4)$, $m_1'(\theta_1 \cup \theta_2 \cup \theta_3 \cup \theta_4)$, and applying Dempster's rule, one easily gets the following expressions for $m_1(.|\theta_3 \cup \theta_4)$. All $m_1(.|\theta_3 \cup \theta_4)$ masses are zero except, theoretically,


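$$m_1(\theta_3|\theta_3 \cup \theta_4) = \underbrace{m_1''(\theta_3 \cup \theta_4)}_{1} \left[ m_1'(\theta_3) + m_1'(\theta_1 \cup \theta_3) + m_1'(\theta_2 \cup \theta_3) + m_1'(\theta_1 \cup \theta_2 \cup \theta_3) \right] / K_1$$

$$m_1(\theta_4|\theta_3 \cup \theta_4) = \underbrace{m_1''(\theta_3 \cup \theta_4)}_{1} \left[ m_1'(\theta_4) + m_1'(\theta_1 \cup \theta_4) + m_1'(\theta_2 \cup \theta_4) + m_1'(\theta_1 \cup \theta_2 \cup \theta_4) \right] / K_1$$

$$m_1(\theta_3 \cup \theta_4|\theta_3 \cup \theta_4) = \underbrace{m_1''(\theta_3 \cup \theta_4)}_{1} \left[ m_1'(\theta_3 \cup \theta_4) + m_1'(\theta_1 \cup \theta_3 \cup \theta_4) + m_1'(\theta_2 \cup \theta_3 \cup \theta_4) + m_1'(\theta_1 \cup \theta_2 \cup \theta_3 \cup \theta_4) \right] / K_1$$

with

$$K_1 \triangleq 1 - \underbrace{m_1''(\theta_3 \cup \theta_4)}_{1} \left[ m_1'(\theta_1) + m_1'(\theta_2) + m_1'(\theta_1 \cup \theta_2) \right]$$
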
To complete the derivation of $m_1(.|\theta_3 \cup \theta_4)$, one needs to use the fact that $\text{Bel}_1(\bar{f}|p) = 1 - \epsilon_1$, which, by definition [16], is expressed by

$$\text{Bel}_1(\bar{f}|p) = \text{Bel}_1(\theta_1 \cup \theta_3|\theta_3 \cup \theta_4) = m_1(\theta_1|\theta_3 \cup \theta_4) + m_1(\theta_3|\theta_3 \cup \theta_4) + m_1(\theta_1 \cup \theta_3|\theta_3 \cup \theta_4) = 1 - \epsilon_1$$

But from the generic expression of $m_1(.|\theta_3 \cup \theta_4)$, one also knows that $m_1(\theta_1|\theta_3 \cup \theta_4) = 0$ and $m_1(\theta_1 \cup \theta_3|\theta_3 \cup \theta_4) = 0$. Thus the knowledge of $\text{Bel}_1(\bar{f}|p) = 1 - \epsilon_1$ implies

$$m_1(\theta_3|\theta_3 \cup \theta_4) = \left[ m_1'(\theta_3) + m_1'(\theta_1 \cup \theta_3) + m_1'(\theta_2 \cup \theta_3) + m_1'(\theta_1 \cup \theta_2 \cup \theta_3) \right] / K_1 = 1 - \epsilon_1$$

This is, however, not sufficient to fully define the values of all components of $m_1(.|\theta_3 \cup \theta_4)$ or, equivalently, of all components of $m_1'(.)$. To complete the derivation without extra unjustified specific information, one needs to apply the minimal commitment principle (MCP), which states that one should never give more support to the truth of a proposition than justified [8]. According to this principle, we commit a non-null value only to the least specific proposition involved in the expression of $m_1(\theta_3|\theta_3 \cup \theta_4)$. In other words, the MCP allows us to choose legitimately

$$m_1'(\theta_1) = m_1'(\theta_2) = m_1'(\theta_3) = 0$$

$$m_1'(\theta_1 \cup \theta_2) = m_1'(\theta_1 \cup \theta_3) = m_1'(\theta_2 \cup \theta_3) = 0$$

$$m_1'(\theta_1 \cup \theta_2 \cup \theta_3) \neq 0$$

Thus $K_1 = 1$ and $m_1(\theta_3|\theta_3 \cup \theta_4)$ reduces to

$$m_1(\theta_3|\theta_3 \cup \theta_4) = m_1'(\theta_1 \cup \theta_2 \cup \theta_3) = 1 - \epsilon_1$$


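Since the sum of basic belief assignments must be one, one must also have for the remaining (so far uncommitted) masses of $m_1'(.)$ the constraint

$$\begin{aligned}
& m_1'(\theta_4) + m_1'(\theta_1 \cup \theta_4) + m_1'(\theta_2 \cup \theta_4) + m_1'(\theta_1 \cup \theta_2 \cup \theta_4) \\
& + m_1'(\theta_3 \cup \theta_4) + m_1'(\theta_1 \cup \theta_3 \cup \theta_4) + m_1'(\theta_2 \cup \theta_3 \cup \theta_4) \\
& + m_1'(\theta_1 \cup \theta_2 \cup \theta_3 \cup \theta_4) = \epsilon_1
\end{aligned}$$

By applying the MCP a second time, one chooses $m_1'(\theta_1 \cup \theta_2 \cup \theta_3 \cup \theta_4) = \epsilon_1$.

Finally, the complete and least specific belief assignment $m_1(.|p)$ compatible with the available prior information $\text{Bel}_1(\bar{f}|p) = 1 - \epsilon_1$ provided by the source $\mathcal{B}_1$ reduces to

$$m_1(\theta_3|\theta_3 \cup \theta_4) = m_1'(\theta_1 \cup \theta_2 \cup \theta_3) = 1 - \epsilon_1 \tag{12.18}$$

$$m_1(\theta_3 \cup \theta_4|\theta_3 \cup \theta_4) = m_1'(\theta_1 \cup \theta_2 \cup \theta_3 \cup \theta_4) = \epsilon_1 \tag{12.19}$$

or equivalently

$$m_1(\bar{f} \cap p|p) = m_1'(\bar{p} \cup \bar{f}) = 1 - \epsilon_1 \tag{12.20}$$

$$m_1(p|p) = m_1'(\bar{p} \cup \bar{f} \cup p \cup f) = \epsilon_1 \tag{12.21}$$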


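It is easy to check, from the mass $m_1(.|p)$, that one indeed gets $\text{Bel}_1(\bar{f}|p) = 1 - \epsilon_1$. Indeed:

$$\begin{aligned}
\text{Bel}_1(\bar{f}|p) &= \text{Bel}_1(\theta_1 \cup \theta_3|p) \\
&= \text{Bel}_1((\bar{f} \cap \bar{p}) \cup (\bar{f} \cap p)|p) \\
&= \underbrace{m_1(\bar{f} \cap \bar{p}|p)}_{0} + m_1(\bar{f} \cap p|p) + \underbrace{m_1((\bar{f} \cap \bar{p}) \cup (\bar{f} \cap p)|p)}_{0} \\
&= m_1(\bar{f} \cap p|p) \\
&= 1 - \epsilon_1
\end{aligned}$$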
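
The construction of $m_1(.|p)$ can be replayed numerically. The following is a minimal sketch, assuming a dictionary-of-frozensets representation of basic belief assignments; the function names `dempster` and `belief`, the string labels of the four cells and the value of $\epsilon_1$ are illustrative choices, not part of the original text. It combines the minimally committed $m_1'(.)$ with $m_1''(p) = 1$ by Dempster's rule and recovers (12.18), (12.19) and $\text{Bel}_1(\bar{f}|p) = 1 - \epsilon_1$.

```python
from itertools import product


def dempster(m_a: dict, m_b: dict) -> dict:
    """Dempster's rule for two mass functions given as {frozenset: mass}."""
    combined, conflict = {}, 0.0
    for (x, mass_x), (y, mass_y) in product(m_a.items(), m_b.items()):
        z = x & y
        if z:
            combined[z] = combined.get(z, 0.0) + mass_x * mass_y
        else:
            conflict += mass_x * mass_y  # mass falling on the empty set
    norm = 1.0 - conflict
    if norm <= 0.0:
        raise ValueError("total conflict: Dempster's rule is undefined")
    return {z: mass / norm for z, mass in combined.items()}


def belief(m: dict, hypothesis: frozenset) -> float:
    """Bel(hypothesis) = sum of the masses of focal elements included in it."""
    return sum(mass for z, mass in m.items() if z <= hypothesis)


eps1 = 0.01  # illustrative value of the small doubt on rule r1

# Cells of Theta_1 (labels are ours): theta1..theta4 as defined in the text.
t1, t2, t3, t4 = "not_f&not_p", "f&not_p", "not_f&p", "f&p"
p = frozenset({t3, t4})
not_f = frozenset({t1, t3})

# Minimally committed m1' obtained from the MCP, and m1'' with m1''(p) = 1.
m1_prime = {frozenset({t1, t2, t3}): 1.0 - eps1,
            frozenset({t1, t2, t3, t4}): eps1}
m1_second = {p: 1.0}

m1_given_p = dempster(m1_prime, m1_second)
print("m1(theta3 | p)          =", m1_given_p[frozenset({t3})])
print("m1(theta3 U theta4 | p) =", m1_given_p[p])
print("Bel1(not f | p)         =", belief(m1_given_p, not_f))
```

In this particular combination no mass falls on the empty set, so the normalization constant is one, in agreement with $K_1 = 1$ in the derivation above.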
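In a similar way, for the source $\mathcal{B}_2$ with $\Theta_2$ defined as

$$\Theta_2 = \{\theta_1 \triangleq f \cap \bar{b},\ \theta_2 \triangleq \bar{b} \cap \bar{f},\ \theta_3 \triangleq f \cap b,\ \theta_4 \triangleq \bar{f} \cap b\}$$

(whose schematic partition is analogous to that of $\Theta_1$, with $b = \theta_3 \cup \theta_4$), one looks for $m_2(.|b) = [m_2' \oplus m_2''](.)$ with $m_2''(b) = m_2''(\theta_3 \cup \theta_4) = 1$. From the MCP, the condition $\text{Bel}_2(f|b) = 1 - \epsilon_2$ and simple algebraic manipulations, one finally gets

$$m_2(\theta_3|\theta_3 \cup \theta_4) = m_2'(\theta_1 \cup \theta_2 \cup \theta_3) = 1 - \epsilon_2 \tag{12.22}$$

$$m_2(\theta_3 \cup \theta_4|\theta_3 \cup \theta_4) = m_2'(\theta_1 \cup \theta_2 \cup \theta_3 \cup \theta_4) = \epsilon_2 \tag{12.23}$$

or equivalently

$$m_2(f \cap b|b) = m_2'(\bar{b} \cup f) = 1 - \epsilon_2 \tag{12.24}$$

$$m_2(b|b) = m_2'(\bar{b} \cup \bar{f} \cup b \cup f) = \epsilon_2 \tag{12.25}$$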
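In a similar way, for the source $\mathcal{B}_3$ with $\Theta_3$ defined as

$$\Theta_3 = \{\theta_1 \triangleq \bar{b} \cap \bar{p},\ \theta_2 \triangleq b \cap \bar{p},\ \theta_3 \triangleq p \cap b,\ \theta_4 \triangleq \bar{b} \cap p\}$$

schematically represented by

[Figure: partition of $\Theta_3$ into the four cells $\theta_1, \ldots, \theta_4$, with $p = \theta_3 \cup \theta_4$ and $\bar{p} = \theta_1 \cup \theta_2$.]

one looks for $m_3(.|p) = [m_3' \oplus m_3''](.)$ with $m_3''(p) = m_3''(\theta_3 \cup \theta_4) = 1$. From the MCP, the condition $\text{Bel}_3(b|p) = 1 - \epsilon_3$ and simple algebraic manipulations, one finally gets

$$m_3(\theta_3|\theta_3 \cup \theta_4) = m_3'(\theta_1 \cup \theta_2 \cup \theta_3) = 1 - \epsilon_3 \tag{12.26}$$

$$m_3(\theta_3 \cup \theta_4|\theta_3 \cup \theta_4) = m_3'(\theta_1 \cup \theta_2 \cup \theta_3 \cup \theta_4) = \epsilon_3 \tag{12.27}$$

or equivalently

$$m_3(b \cap p|p) = m_3'(\bar{p} \cup b) = 1 - \epsilon_3 \tag{12.28}$$

$$m_3(p|p) = m_3'(\bar{b} \cup \bar{p} \cup b \cup p) = \epsilon_3 \tag{12.29}$$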

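Since all the complete prior basic belief assignments are now available, one can combine them with Dempster's rule to summarize all the prior knowledge drawn from our simple rule-based expert system, characterized by the rules $R = \{r_1, r_2, r_3\}$ and the convictions/confidences $W = \{w_1, w_2, w_3\}$ in these rules.

The fusion operation first requires choosing the following frame of discernment $\Theta$ (satisfying Shafer's model), given by

$$\Theta = \{\theta_1, \theta_2, \theta_3, \theta_4, \theta_5, \theta_6, \theta_7, \theta_8\}$$

where

$$\begin{aligned}
\theta_1 &\triangleq f \cap b \cap p & \theta_5 &\triangleq \bar{f} \cap b \cap p \\
\theta_2 &\triangleq f \cap b \cap \bar{p} & \theta_6 &\triangleq \bar{f} \cap b \cap \bar{p} \\
\theta_3 &\triangleq f \cap \bar{b} \cap p & \theta_7 &\triangleq \bar{f} \cap \bar{b} \cap p \\
\theta_4 &\triangleq f \cap \bar{b} \cap \bar{p} & \theta_8 &\triangleq \bar{f} \cap \bar{b} \cap \bar{p}
\end{aligned}$$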
The fusion of the masses $m_1(.)$ given by eqs. (12.20)-(12.21) with $m_2(.)$ given by eqs. (12.24)-(12.25) using Dempster's rule of combination yields $m_{12}(.) = [m_1 \oplus m_2](.)$ with the following non-null components

$$m_{12}(f \cap b \cap p) = \epsilon_1 (1 - \epsilon_2) / K_{12}$$

$$m_{12}(\bar{f} \cap b \cap p) = \epsilon_2 (1 - \epsilon_1) / K_{12}$$

$$m_{12}(b \cap p) = \epsilon_1 \epsilon_2 / K_{12}$$

with $K_{12} \triangleq 1 - (1 - \epsilon_1)(1 - \epsilon_2) = \epsilon_1 + \epsilon_2 - \epsilon_1 \epsilon_2$.


The fusion of all prior knowledge by Dempster's rule, $m_{123}(.) = [m_1 \oplus m_2 \oplus m_3](.) = [m_{12} \oplus m_3](.)$, yields the final result:

$$m_{123}(f \cap b \cap p) = m_{123}(\theta_1) = \epsilon_1 (1 - \epsilon_2) / K_{123}$$

$$m_{123}(\bar{f} \cap b \cap p) = m_{123}(\theta_5) = \epsilon_2 (1 - \epsilon_1) / K_{123}$$

$$m_{123}(b \cap p) = m_{123}(\theta_1 \cup \theta_5) = \epsilon_1 \epsilon_2 / K_{123}$$

with $K_{123} = K_{12} \triangleq 1 - (1 - \epsilon_1)(1 - \epsilon_2) = \epsilon_1 + \epsilon_2 - \epsilon_1 \epsilon_2$,

which actually and precisely defines the conditional belief assignment $m_{123}(.|p \cap b)$. It turns out that the fusion with the last basic belief assignment $m_3(.)$ brings no change with respect to the previous fusion result $m_{12}(.)$ in this particular problem.
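
The fusion on the refined frame $\Theta$ can be checked with a short script. This is a sketch under stated assumptions: atoms of $\Theta$ are encoded as boolean triples `(f, b, p)`, the helper names `event` and `dempster` are ours, and the numerical values of $\epsilon_1$, $\epsilon_2$, $\epsilon_3$ are arbitrary illustrative choices. It reproduces the components of $m_{12}(.)$, the constant $K_{12}$, and the fact that combining with $m_3(.)$ changes nothing.

```python
from itertools import product

# The eight atoms of Theta, encoded as (f, b, p) truth values.
ATOMS = frozenset(product((True, False), repeat=3))


def event(predicate) -> frozenset:
    """Subset of Theta whose atoms (f, b, p) satisfy the predicate."""
    return frozenset(atom for atom in ATOMS if predicate(*atom))


def dempster(m_a: dict, m_b: dict):
    """Dempster's rule; returns (normalized masses, conflicting mass)."""
    combined, conflict = {}, 0.0
    for (x, mass_x), (y, mass_y) in product(m_a.items(), m_b.items()):
        z = x & y
        if z:
            combined[z] = combined.get(z, 0.0) + mass_x * mass_y
        else:
            conflict += mass_x * mass_y
    norm = 1.0 - conflict
    if norm <= 0.0:
        raise ValueError("total conflict: Dempster's rule is undefined")
    return {z: mass / norm for z, mass in combined.items()}, conflict


eps1, eps2, eps3 = 0.01, 0.02, 0.03  # illustrative values only

# Conditional assignments (12.20)-(12.21), (12.24)-(12.25), (12.28)-(12.29).
m1 = {event(lambda f, b, p: p and not f): 1 - eps1, event(lambda f, b, p: p): eps1}
m2 = {event(lambda f, b, p: b and f): 1 - eps2, event(lambda f, b, p: b): eps2}
m3 = {event(lambda f, b, p: p and b): 1 - eps3, event(lambda f, b, p: p): eps3}

m12, conflict12 = dempster(m1, m2)
m123, conflict3 = dempster(m12, m3)

k12 = eps1 + eps2 - eps1 * eps2
fly_bp = event(lambda f, b, p: f and b and p)
nofly_bp = event(lambda f, b, p: (not f) and b and p)
bp = event(lambda f, b, p: b and p)

print("normalization 1 - conflict =", 1 - conflict12, " K12 =", k12)
print("m12(f & b & p)     =", m12[fly_bp])
print("m12(not f & b & p) =", m12[nofly_bp])
print("m12(b & p)         =", m12[bp])
print("conflict added by m3 =", conflict3)
```

Every focal element of $m_{12}(.)$ is included in $b \cap p$, so intersecting it with either focal element of $m_3(.)$ leaves it unchanged; this is the mechanism behind the remark that $r_3$ plays no role in the result.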


Since we are actually interested in assessing the belief that our observed particular penguin-animal named Tweety (denoted as $T = (p \cap b)$) can fly, we need to combine all our prior knowledge $m_{123}(.)$ drawn from our rule-based system with the belief assignment $m_o(T = (p \cap b)) = 1$ characterizing the observation about Tweety. Applying Dempster's rule again, one finally gets the resulting conditional basic belief function $m_{o123}(.) = [m_o \oplus m_{123}](.)$ defined by

$$m_{o123}(T = (f \cap b \cap p)|T = (p \cap b)) = \epsilon_1 (1 - \epsilon_2) / K_{12}$$

$$m_{o123}(T = (\bar{f} \cap b \cap p)|T = (p \cap b)) = \epsilon_2 (1 - \epsilon_1) / K_{12}$$

$$m_{o123}(T = (b \cap p)|T = (p \cap b)) = \epsilon_1 \epsilon_2 / K_{12}$$





From Dempster-Shafer reasoning, the belief and plausibility that Tweety can fly are given by

$$\text{Bel}(T = f|T = (p \cap b)) = \sum_{x \in 2^{\Theta},\, x \subseteq f} m_{o123}(T = x|T = (p \cap b))$$

$$\text{Pl}(T = f|T = (p \cap b)) = \sum_{x \in 2^{\Theta},\, x \cap f \neq \emptyset} m_{o123}(T = x|T = (p \cap b))$$

Because $f = [(f \cap b \cap p) \cup (f \cap b \cap \bar{p}) \cup (f \cap \bar{b} \cap p) \cup (f \cap \bar{b} \cap \bar{p})]$, and given the specific values of the masses defining $m_{o123}(.)$, one has

$$\text{Bel}(T = f|T = (p \cap b)) = m_{o123}(T = (f \cap b \cap p)|T = (p \cap b))$$

$$\text{Pl}(T = f|T = (p \cap b)) = m_{o123}(T = (f \cap b \cap p)|T = (p \cap b)) + m_{o123}(T = (b \cap p)|T = (p \cap b))$$



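and finally

$$\text{Bel}(T = f|T = (p \cap b)) = \frac{\epsilon_1 (1 - \epsilon_2)}{K_{12}} \tag{12.30}$$

$$\text{Pl}(T = f|T = (p \cap b)) = \frac{\epsilon_1 (1 - \epsilon_2)}{K_{12}} + \frac{\epsilon_1 \epsilon_2}{K_{12}} = \frac{\epsilon_1}{K_{12}} \tag{12.31}$$

In a similar way, one gets for the belief and the plausibility that Tweety cannot fly

$$\text{Bel}(T = \bar{f}|T = (p \cap b)) = \frac{\epsilon_2 (1 - \epsilon_1)}{K_{12}} \tag{12.32}$$

$$\text{Pl}(T = \bar{f}|T = (p \cap b)) = \frac{\epsilon_2 (1 - \epsilon_1)}{K_{12}} + \frac{\epsilon_1 \epsilon_2}{K_{12}} = \frac{\epsilon_2}{K_{12}} \tag{12.33}$$

Using the first-order approximation when $\epsilon_1$ and $\epsilon_2$ are very small positive numbers, one finally gets

$$\text{Bel}(T = f|T = (p \cap b)) \approx \text{Pl}(T = f|T = (p \cap b)) \approx \frac{\epsilon_1}{\epsilon_1 + \epsilon_2}$$

In a similar way, one gets for the belief that Tweety cannot fly

$$\text{Bel}(T = \bar{f}|T = (p \cap b)) \approx \text{Pl}(T = \bar{f}|T = (p \cap b)) \approx \frac{\epsilon_2}{\epsilon_1 + \epsilon_2}$$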
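
The closed forms (12.30) to (12.33) can be evaluated exactly with rational arithmetic. The sketch below is illustrative only; the function name `tweety_bel_pl` and the sample values $\epsilon_1 = 1/100$, $\epsilon_2 = 1/1000$ are our assumptions. It checks the duality $\text{Bel}(f) + \text{Pl}(\bar{f}) = 1$ and compares the exact belief with the first-order approximation $\epsilon_1 / (\epsilon_1 + \epsilon_2)$.

```python
from fractions import Fraction


def tweety_bel_pl(eps1: Fraction, eps2: Fraction):
    """Closed forms (12.30)-(12.33) for Tweety flying / not flying."""
    k12 = eps1 + eps2 - eps1 * eps2
    bel_fly = eps1 * (1 - eps2) / k12      # (12.30)
    pl_fly = eps1 / k12                    # (12.31)
    bel_not_fly = eps2 * (1 - eps1) / k12  # (12.32)
    pl_not_fly = eps2 / k12                # (12.33)
    return bel_fly, pl_fly, bel_not_fly, pl_not_fly


# Illustrative values: non-flying birds (eps2) are rarer than flying penguins (eps1).
eps1, eps2 = Fraction(1, 100), Fraction(1, 1000)
bel_fly, pl_fly, bel_not_fly, pl_not_fly = tweety_bel_pl(eps1, eps2)
first_order = eps1 / (eps1 + eps2)

print("Bel(fly) =", float(bel_fly), " Pl(fly) =", float(pl_fly))
print("Bel(not fly) =", float(bel_not_fly), " Pl(not fly) =", float(pl_not_fly))
print("first-order approximation of Bel(fly) =", float(first_order))
```

With these values the belief that Tweety flies is close to $10/11$, which makes the counter-intuitive character of the result tangible: the rarer non-flying birds are, the more the penguin-bird is believed to fly.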


This result coincides with Judea Pearl's result, but a different analysis and a more detailed presentation have been given here. It turns out that this simple and complete analysis actually corresponds to the ballooning extension and the generalized Bayesian theorem proposed by Smets in [21] [24] and discussed by Shafer, although it was carried out independently of Smets' works. As pointed out by Judea Pearl, this result based on DST and Dempster's rule of combination looks very paradoxical/counter-intuitive, since it means that if nonflying birds are very rare, i.e. $\epsilon_2 \approx 0$, then penguin-birds like our observed penguin-bird Tweety have a very high chance of flying. As stated by Judea Pearl in [11] pages 448-449: "The clash with intuition revolves not around the exact numerical value of $\text{Bel}(f)$ but rather around the unacceptable phenomenon that rule $r_3$, stating that penguins are a subclass of birds, plays no role in the analysis. Knowing that Tweety is both a penguin and a bird renders $\text{Bel}(T = f|T = (p \cap b))$ solely a function of $m_1(.)$ and $m_2(.)$, regardless of how penguins and birds are related. This stands contrary to common discourse, where people expect class properties to be overridden by properties of more specific subclasses. While in classical logic the three rules in our example would yield an unforgivable contradiction, the uncertainties attached to these rules, together with Dempster's normalization, now render them manageable. However, they are managed in the wrong way whenever we interpret if-then rules as randomized logical formulas of the material-implication type, instead of statements of conditional probabilities". Keep in mind, however, that Pearl makes this statement to show the semantic clash between Dempster-Shafer reasoning and the fallacious Bayesian reasoning, in order to support the Bayesian approach.


12.5 The Dezert-Smarandache reasoning 


We analyze here the Tweety penguin triangle problem with the DSmT (see Part I of this book for a
presentation of DSmT). The prior knowledge characterized by the rules $R = \{r_1, r_2, r_3\}$ and convictions
$W = \{w_1, w_2, w_3\}$ is modeled as three independent sources of evidence defined on separate minimal and
potentially paradoxical (i.e. internally conflicting) frames $\Theta_1 \triangleq \{p, \bar{f}\}$, $\Theta_2 \triangleq \{b, f\}$ and $\Theta_3 \triangleq \{p, b\}$, since
the rule $r_1$ doesn't refer to the existence of $b$, the rule $r_2$ doesn't refer to the existence of $p$ and the rule
$r_3$ doesn't refer to the existence of $f$ or $\bar{f}$. Let's note that the DSmT doesn't require the refinement of
frames as DST does (see previous section). We follow the same analysis as in the previous section, but now
based on our DSm reasoning and the DSm rule of combination.


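The first source $\mathcal{B}_1$, relative to $r_1$ with confidence $w_1 = 1 - \epsilon_1$, provides us the conditional belief
$\mathrm{Bel}_1(\bar{f}|p)$, which is now defined from a paradoxical basic belief assignment $m_1(.)$ resulting from the DSm
combination of $m''_1(p) = 1$ with $m'_1(.)$ defined on the hyper-power set $D^{\Theta_1} = \{\emptyset, p, \bar{f}, p \cap \bar{f}, p \cup \bar{f}\}$. The
choice for $m'_1(.)$ results directly from the derivation of the DSm rule and the application of the MCP.
Indeed, the non-null components of $m_1(.)$ are given by (we introduce explicitly the conditioning term in
the notation for convenience):

$$m_1(p|p) = m''_1(p)\,m'_1(p) + m''_1(p)\,m'_1(p \cup \bar{f}) = m'_1(p) + m'_1(p \cup \bar{f})$$

$$m_1(p \cap \bar{f}|p) = m''_1(p)\,m'_1(\bar{f}) + m''_1(p)\,m'_1(p \cap \bar{f}) = m'_1(\bar{f}) + m'_1(p \cap \bar{f})$$

The information $\mathrm{Bel}_1(\bar{f}|p) = 1 - \epsilon_1$ implies

$$\mathrm{Bel}_1(\bar{f}|p) = m_1(\bar{f}|p) + m_1(p \cap \bar{f}|p) = 1 - \epsilon_1$$

Since $m_1(p|p) + m_1(p \cap \bar{f}|p) = 1$, one has necessarily $m_1(\bar{f}|p) = 0$ and thus, from the previous equation,
$m_1(\bar{f} \cap p|p) = 1 - \epsilon_1$, which implies both

$$m_1(p|p) = \epsilon_1$$

$$m_1(p \cap \bar{f}|p) = m'_1(\bar{f}) + m'_1(p \cap \bar{f}) = 1 - \epsilon_1$$

Applying the MCP, it results that one must choose

$$m'_1(\bar{f}) = 1 - \epsilon_1 \quad \text{and} \quad m'_1(p \cap \bar{f}) = 0$$

The sum of the remaining masses of $m'_1(.)$ must then be equal to $\epsilon_1$, i.e.

$$m'_1(p) + m'_1(p \cup \bar{f}) = \epsilon_1$$

Applying the MCP again to this last constraint, one naturally gets

$$m'_1(p) = 0 \quad \text{and} \quad m'_1(p \cup \bar{f}) = \epsilon_1$$

Finally, the belief assignment $m_1(.|p)$ relative to the source $\mathcal{B}_1$ and compatible with the constraint
$\mathrm{Bel}_1(\bar{f}|p) = 1 - \epsilon_1$ holds the same numerical values as within the DST analysis (see eqs. (12.20)-(12.21))
and is given by

$$m_1(p \cap \bar{f}|p) = 1 - \epsilon_1$$

$$m_1(p|p) = \epsilon_1$$

but results here from the DSm combination of the two following assignments (i.e. $m_1(.) = [m''_1 \oplus m'_1](.) =
[m'_1 \oplus m''_1](.)$)

$$\begin{cases} m'_1(\bar{f}) = 1 - \epsilon_1 \quad \text{and} \quad m'_1(p \cup \bar{f}) = \epsilon_1 \\ m''_1(p) = 1 \end{cases} \tag{12.34}$$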
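The internal combination (12.34) can be checked numerically. The following Python sketch is an illustration under stated assumptions: propositions of the free model on {p, "cannot fly"} are encoded as sets of Venn-diagram regions, the classic DSm rule is taken to be the unnormalized conjunctive rule on that encoding, and all names (`dsm_classic`, `source_one`, `belief`, the region labels) are hypothetical.

```python
from itertools import product

# Venn regions of the free model on {p, not-f}; the labels are illustrative.
P_ONLY, NF_ONLY, BOTH = "p only", "not-f only", "p and not-f"
p = frozenset({P_ONLY, BOTH})     # proposition "penguin"
nf = frozenset({NF_ONLY, BOTH})   # proposition "cannot fly"


def dsm_classic(ma, mb):
    """Classic DSm rule: conjunctive consensus without normalization."""
    out = {}
    for (x, vx), (y, vy) in product(ma.items(), mb.items()):
        out[x & y] = out.get(x & y, 0.0) + vx * vy
    return out


def source_one(e1):
    """Combine the less specific assignment of (12.34) with the
    categorical assignment putting all the mass on p."""
    less_specific = {nf: 1 - e1, p | nf: e1}
    categorical = {p: 1.0}
    return dsm_classic(categorical, less_specific)


def belief(m, a):
    """Sum of the masses of all focal elements included in a."""
    return sum(v for x, v in m.items() if x <= a)


if __name__ == "__main__":
    m1 = source_one(0.05)
    print(m1[p & nf], m1[p], belief(m1, nf))
```

For any $\epsilon_1$ the combined assignment places $1 - \epsilon_1$ on $p \cap \bar{f}$ and $\epsilon_1$ on $p$, so the conditional belief in "cannot fly" equals $1 - \epsilon_1$ as required.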
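In a similar manner, working on $\Theta_2 = \{b, f\}$ for source $\mathcal{B}_2$ with the condition $\mathrm{Bel}_2(f|b) = 1 - \epsilon_2$,
the mass $m_2(.|b)$ results from the internal DSm combination of the two following assignments

$$\begin{cases} m'_2(f) = 1 - \epsilon_2 \quad \text{and} \quad m'_2(b \cup f) = \epsilon_2 \\ m''_2(b) = 1 \end{cases} \tag{12.35}$$

Similarly, working on $\Theta_3 = \{p, b\}$ for source $\mathcal{B}_3$ with the condition $\mathrm{Bel}_3(b|p) = 1 - \epsilon_3$, the mass
$m_3(.|p)$ results from the internal DSm combination of the two following assignments

$$\begin{cases} m'_3(b) = 1 - \epsilon_3 \quad \text{and} \quad m'_3(b \cup p) = \epsilon_3 \\ m''_3(p) = 1 \end{cases} \tag{12.36}$$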
It can be easily verified that these (less specific) basic belief assignments generate the conditions
$\mathrm{Bel}_1(\bar{f}|p) = 1 - \epsilon_1$, $\mathrm{Bel}_2(f|b) = 1 - \epsilon_2$ and $\mathrm{Bel}_3(b|p) = 1 - \epsilon_3$.


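Now let's examine the result of the fusion of all these masses based on DSmT, i.e. by applying the
DSm rule of combination to the following basic belief assignments

$$m_1(p \cap \bar{f}|p) = 1 - \epsilon_1 \quad \text{and} \quad m_1(p|p) = \epsilon_1$$

$$m_2(b \cap f|b) = 1 - \epsilon_2 \quad \text{and} \quad m_2(b|b) = \epsilon_2$$

$$m_3(p \cap b|p) = 1 - \epsilon_3 \quad \text{and} \quad m_3(p|p) = \epsilon_3$$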


Note that these basic belief assignments turn out to be identical to those drawn from the DST framework
analysis done in the previous section for this specific problem, because of the integrity constraint $f \cap \bar{f} = \emptyset$ and
the MCP, but they actually result from a slightly different and simpler analysis drawn here from DSmT. So
we attack the TP2 with the same information as in the analysis based on DST, but we will show that
a coherent conclusion can be drawn with DSm reasoning.


Let's emphasize now that one has to deal here with the hypotheses/elements $p$, $b$, $f$ and $\bar{f}$, and thus our
global frame is given by $\Theta = \{b, p, f, \bar{f}\}$. Note that $\Theta$ doesn't satisfy Shafer's model since the elements of
$\Theta$ are not all exclusive. This is a major difference between the foundations of DSmT and the
foundations of DST. But because only $f$ and $\bar{f}$ are truly exclusive, i.e. $f \cap \bar{f} = \emptyset$, we are faced with a quite
simple hybrid DSm model $\mathcal{M}$, and thus the hybrid DSm fusion rule must apply rather than the classic DSm
rule. We briefly recall here (a complete derivation, justification and examples can be found in the dedicated
chapter of this book) that the hybrid DSm rule of combination associated with a given hybrid DSm model for $k \geq 2$
independent sources of information is defined for all $A \in D^{\Theta}$ as:
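$$m_{\mathcal{M}(\Theta)}(A) \triangleq \phi(A)\Big[S_1(A) + S_2(A) + S_3(A)\Big] \tag{12.37}$$

where $\phi(A)$ is the characteristic non-emptiness function of the set $A$, i.e. $\phi(A) = 1$ if $A \notin \boldsymbol{\emptyset}$ ($\boldsymbol{\emptyset} \triangleq \{\emptyset_{\mathcal{M}}, \emptyset\}$
being the set of all relatively and absolutely empty elements) and $\phi(A) = 0$ otherwise, and

$$S_1(A) \triangleq \sum_{\substack{X_1, X_2, \ldots, X_k \in D^{\Theta} \\ (X_1 \cap X_2 \cap \ldots \cap X_k) = A}} \; \prod_{i=1}^{k} m_i(X_i) \tag{12.38}$$

$$S_2(A) \triangleq \sum_{\substack{X_1, X_2, \ldots, X_k \in \boldsymbol{\emptyset} \\ [\mathcal{U} = A] \vee [(\mathcal{U} \in \boldsymbol{\emptyset}) \wedge (A = I_t)]}} \; \prod_{i=1}^{k} m_i(X_i) \tag{12.39}$$

$$S_3(A) \triangleq \sum_{\substack{X_1, X_2, \ldots, X_k \in D^{\Theta} \\ (X_1 \cup X_2 \cup \ldots \cup X_k) = A \\ (X_1 \cap X_2 \cap \ldots \cap X_k) \in \boldsymbol{\emptyset}}} \; \prod_{i=1}^{k} m_i(X_i) \tag{12.40}$$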
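with $\mathcal{U} \triangleq u(X_1) \cup u(X_2) \cup \ldots \cup u(X_k)$, where $u(X)$ is the union of all singletons $\theta_i$ that compose $X$, and
$I_t \triangleq \theta_1 \cup \theta_2 \cup \ldots \cup \theta_n$ is the total ignorance defined on the frame $\Theta = \{\theta_1, \ldots, \theta_n\}$. For example, if $X$ is
a singleton then $u(X) = X$; if $X = \theta_1 \cap \theta_2$ or $X = \theta_1 \cup \theta_2$ then $u(X) = \theta_1 \cup \theta_2$; if $X = (\theta_1 \cap \theta_2) \cup \theta_3$ then
$u(X) = \theta_1 \cup \theta_2 \cup \theta_3$; by convention $u(\emptyset) \triangleq \emptyset$.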
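The operator $u(X)$ is simple to compute once a proposition is stored as an expression tree. The Python sketch below reproduces the examples just given; the encoding (a string for a singleton, `None` for the empty set, a tuple `(op, left, right)` for a compound proposition) and the function name `u` are assumptions made for this illustration only.

```python
def u(x):
    """Return the set of singleton names that compose proposition x.

    Hypothetical encoding: a singleton is a string, the empty set is None,
    and a compound proposition is a tuple (op, left, right) with op being
    "and" (intersection) or "or" (union).
    """
    if x is None:
        return frozenset()
    if isinstance(x, str):
        return frozenset({x})
    op, left, right = x
    if op not in ("and", "or"):
        raise ValueError(f"unknown operator: {op!r}")
    # Intersections and unions contribute the same singletons to u(X).
    return u(left) | u(right)


if __name__ == "__main__":
    print(sorted(u("t1")))
    print(sorted(u(("and", "t1", "t2"))))
    print(sorted(u(("or", ("and", "t1", "t2"), "t3"))))
```

The returned set stands for the union of the listed singletons, so the intersection and the union of the same two singletons map to the same value.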


The first sum $S_1(A)$ entering in the previous formula corresponds to the mass $m_{\mathcal{M}^f(\Theta)}(A)$ obtained by
the classic DSm rule of combination based on the free DSm model $\mathcal{M}^f$ (i.e. on the free lattice $D^{\Theta}$). The
second sum $S_2(A)$ entering in the formula of the hybrid DSm rule of combination (12.37) represents the
mass of all relatively and absolutely empty sets which is transferred to the total or relative ignorances.
The third sum $S_3(A)$ entering in the formula of the hybrid DSm rule of combination (12.37) transfers
the sum of relatively empty sets to the non-empty sets in the same way as it was calculated following the
classic DSm rule.

To apply the hybrid DSm fusion rule formula (12.37), it is important to note that $(p \cap \bar{f}) \cap (b \cap f) \cap p \equiv
p \cap b \cap f \cap \bar{f} = \emptyset$ because $f \cap \bar{f} = \emptyset$; thus the mass $(1 - \epsilon_1)(1 - \epsilon_2)\epsilon_3$ is transferred to the hybrid proposition
$H_1 \triangleq (p \cap \bar{f}) \cup (b \cap f) \cup p \equiv (b \cap f) \cup p$. Similarly, $(p \cap \bar{f}) \cap (b \cap f) \cap (p \cap b) \equiv p \cap b \cap f \cap \bar{f} = \emptyset$
because $f \cap \bar{f} = \emptyset$, and therefore its associated mass $(1 - \epsilon_1)(1 - \epsilon_2)(1 - \epsilon_3)$ is transferred to the hybrid
proposition $H_2 \triangleq (p \cap \bar{f}) \cup (b \cap f) \cup (p \cap b)$. No other mass transfer is necessary for this Tweety Penguin
Triangle Problem, and thus we finally get from the hybrid DSm fusion formula (12.37) the following result
for $m_{123}(.|p \cap b) = [m_1 \oplus m_2 \oplus m_3](.)$ (where the $\oplus$ symbol corresponds here to the DSm fusion operator and
we omit the conditioning term $p \cap b$ here due to space limitation):


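$$m_{123}((b \cap f) \cup p\,|\,p \cap b) = (1 - \epsilon_1)(1 - \epsilon_2)\epsilon_3$$

$$m_{123}((p \cap \bar{f}) \cup (b \cap f) \cup (p \cap b)\,|\,p \cap b) = (1 - \epsilon_1)(1 - \epsilon_2)(1 - \epsilon_3)$$

$$m_{123}(p \cap b \cap \bar{f}\,|\,p \cap b) = (1 - \epsilon_1)\epsilon_2\epsilon_3 + (1 - \epsilon_1)\epsilon_2(1 - \epsilon_3) = (1 - \epsilon_1)\epsilon_2$$

$$m_{123}(p \cap b \cap f\,|\,p \cap b) = \epsilon_1(1 - \epsilon_2)\epsilon_3 + \epsilon_1(1 - \epsilon_2)(1 - \epsilon_3) = \epsilon_1(1 - \epsilon_2)$$

$$m_{123}(p \cap b\,|\,p \cap b) = \epsilon_1\epsilon_2\epsilon_3 + \epsilon_1\epsilon_2(1 - \epsilon_3) = \epsilon_1\epsilon_2$$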
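These five masses can be reproduced with a small program. The Python sketch below is a simplified illustration, not a full implementation of (12.37): it encodes each proposition as a set of Venn-diagram regions of the hybrid model (regions lying inside both "fly" and "cannot fly" are removed), applies the $S_1$ term when the intersection is non-empty and the $S_3$ transfer to the union otherwise, and raises an error in the $S_2$ case, which does not occur in this problem. All names (`ELEMS`, `REGIONS`, `prop`, `hybrid_dsm`, `tweety_dsm`) and the dictionary keys are assumptions of this example.

```python
from itertools import combinations, product

ELEMS = ("p", "b", "f", "nf")  # nf stands for "cannot fly"

# Each non-empty subset of ELEMS names the Venn region lying inside exactly
# those elements; the constraint f & nf = empty removes some regions.
REGIONS = frozenset(
    frozenset(c)
    for r in range(1, len(ELEMS) + 1)
    for c in combinations(ELEMS, r)
    if not {"f", "nf"} <= set(c)
)


def prop(name):
    """Proposition 'name' as the set of regions contained in it."""
    return frozenset(r for r in REGIONS if name in r)


def hybrid_dsm(*sources):
    """S1 and S3 parts of the hybrid DSm rule on the region encoding."""
    fused = {}
    for combo in product(*(s.items() for s in sources)):
        sets = [x for x, _ in combo]
        weight = 1.0
        for _, v in combo:
            weight *= v
        target = frozenset.intersection(*sets)   # S1: conjunctive consensus
        if not target:
            target = frozenset.union(*sets)      # S3: transfer to the union
        if not target:
            raise ValueError("S2 transfer needed; not covered by this sketch")
        fused[target] = fused.get(target, 0.0) + weight
    return fused


def tweety_dsm(e1, e2, e3):
    p, b, f, nf = (prop(n) for n in ELEMS)
    m1 = {p & nf: 1 - e1, p: e1}
    m2 = {b & f: 1 - e2, b: e2}
    m3 = {p & b: 1 - e3, p: e3}
    m = hybrid_dsm(m1, m2, m3)
    return {
        "(b&f)|p": m[(b & f) | p],
        "(p&nf)|(b&f)|(p&b)": m[(p & nf) | (b & f) | (p & b)],
        "p&b&nf": m[p & b & nf],
        "p&b&f": m[p & b & f],
        "p&b": m[p & b],
    }


if __name__ == "__main__":
    masses = tweety_dsm(0.1, 0.2, 0.3)
    print(masses, sum(masses.values()))
```

The region encoding is chosen so that set intersection and union respect the integrity constraint automatically; the five returned values add up to 1.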


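We can check that all these masses add up to 1 and that this result is fully coherent with rational
intuition, especially when $\epsilon_3 = 0$, because the non-null components of $m_{123}(.|p \cap b)$ then reduce to

$$m_{123}((p \cap \bar{f}) \cup (b \cap f) \cup (p \cap b)\,|\,p \cap b) = (1 - \epsilon_1)(1 - \epsilon_2)$$

$$m_{123}(p \cap b \cap \bar{f}\,|\,p \cap b) = (1 - \epsilon_1)\epsilon_2$$

$$m_{123}(p \cap b \cap f\,|\,p \cap b) = \epsilon_1(1 - \epsilon_2)$$

$$m_{123}(p \cap b\,|\,p \cap b) = \epsilon_1\epsilon_2$$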
which means that from our DSm reasoning there is a strong uncertainty (due to the conflicting rules
of our rule-based system), when $\epsilon_1$ and $\epsilon_2$ remain small positive numbers, that a penguin-bird animal
is either a penguin-nonflying animal or a bird-flying animal. The small value $\epsilon_1\epsilon_2$ for $m_{123}(p \cap b|p \cap b)$
adequately expresses the fact that we cannot commit a strong basic belief assignment only to $p \cap b$ knowing
$p \cap b$, just because one works on $\Theta = \{p, b, f, \bar{f}\}$ and we cannot consider the property $p \cap b$ solely,
because the "birdness" or "penguinness" property necessarily endows either the flying or the non-flying property.


Therefore the belief that the particular observed penguin-bird animal Tweety (corresponding to
the particular mass $m_o(T = (p \cap b)) = 1$) can fly can be easily derived from the DSm fusion of all our prior
knowledge summarized by $m_{123}(.|p \cap b)$ and the available observation summarized by $m_o(.)$, and we get

$$m_{o123}(T = (p \cap b \cap \bar{f})|T = (p \cap b)) = (1 - \epsilon_1)\epsilon_2$$

$$m_{o123}(T = (p \cap b \cap f)|T = (p \cap b)) = \epsilon_1(1 - \epsilon_2)$$

$$m_{o123}(T = (p \cap b)|T = (p \cap b)) = \epsilon_1\epsilon_2$$

$$m_{o123}(T = (b \cap f) \cup p|T = (p \cap b)) = (1 - \epsilon_1)(1 - \epsilon_2)\epsilon_3$$

$$m_{o123}(T = (p \cap \bar{f}) \cup (b \cap f) \cup (p \cap b)|T = (p \cap b)) = (1 - \epsilon_1)(1 - \epsilon_2)(1 - \epsilon_3)$$

From the DSm reasoning, the belief that Tweety can fly is then given by

$$\mathrm{Bel}(T = f|T = (p \cap b)) = \sum_{x \in D^{\Theta},\, x \subseteq f} m_{o123}(T = x|T = (p \cap b))$$

Using all the components of $m_{o123}(.|T = (p \cap b))$, one directly gets

$$\mathrm{Bel}(T = f|T = (p \cap b)) = m_{o123}(T = (f \cap b \cap p)|T = (p \cap b))$$

and finally

$$\mathrm{Bel}(T = f|T = (p \cap b)) = \epsilon_1(1 - \epsilon_2) \tag{12.41}$$

In a similar way, one will get for the belief that Tweety cannot fly

$$\mathrm{Bel}(T = \bar{f}|T = (p \cap b)) = \epsilon_2(1 - \epsilon_1) \tag{12.42}$$


So now for both cases the beliefs remain very low, which is normal and coherent with the analysis done
in section 12.3.2. Now let's examine the plausibilities of the ability for Tweety to fly or not to fly. These
are given by

$$\mathrm{Pl}(T = f|T = (p \cap b)) \triangleq \sum_{x \in D^{\Theta},\, x \cap f \neq \emptyset} m_{o123}(T = x|T = (p \cap b))$$

$$\mathrm{Pl}(T = \bar{f}|T = (p \cap b)) \triangleq \sum_{x \in D^{\Theta},\, x \cap \bar{f} \neq \emptyset} m_{o123}(T = x|T = (p \cap b))$$

which turn out to be, after elementary algebraic manipulations,

$$\mathrm{Pl}(T = f|T = (p \cap b)) = (1 - \epsilon_2) \tag{12.43}$$

$$\mathrm{Pl}(T = \bar{f}|T = (p \cap b)) = (1 - \epsilon_1) \tag{12.44}$$


So we conclude, as reasonably/rationally expected, that we cannot decide whether Tweety is able to fly
or not, since one has

$$[\mathrm{Bel}(f|p \cap b), \mathrm{Pl}(f|p \cap b)] = [\epsilon_1(1 - \epsilon_2), (1 - \epsilon_2)] \approx [0, 1]$$

$$[\mathrm{Bel}(\bar{f}|p \cap b), \mathrm{Pl}(\bar{f}|p \cap b)] = [\epsilon_2(1 - \epsilon_1), (1 - \epsilon_1)] \approx [0, 1]$$


Note that when setting $\epsilon_1 = 0$ and $\epsilon_2 = 1$ (or $\epsilon_1 = 1$ and $\epsilon_2 = 0$), i.e. when one forces the full consistency
of the initial rule-based system, one gets a coherent result on the certainty of the ability of Tweety to not
fly (or to fly, respectively).
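The two belief intervals and the limiting cases just mentioned can be tabulated with a few lines of code. The Python sketch below simply evaluates the closed forms (12.41)-(12.44) as stated in this section; the function name `dsm_intervals` and the sample values are assumptions of this illustration.

```python
def dsm_intervals(e1, e2):
    """Return ([Bel, Pl] for 'can fly', [Bel, Pl] for 'cannot fly')
    using the closed forms (12.41)-(12.44) as stated in the text."""
    fly = (e1 * (1 - e2), 1 - e2)       # (12.41) and (12.43)
    no_fly = (e2 * (1 - e1), 1 - e1)    # (12.42) and (12.44)
    return fly, no_fly


if __name__ == "__main__":
    # Small uncertainties: both intervals are close to [0, 1].
    print(dsm_intervals(0.01, 0.01))
    # Fully consistent rule base: Tweety certainly cannot fly.
    print(dsm_intervals(0.0, 1.0))
    # Symmetric consistent case: Tweety certainly can fly.
    print(dsm_intervals(1.0, 0.0))
```

In the two consistent cases the intervals collapse to a single point (0 or 1), whereas for small positive $\epsilon_1$ and $\epsilon_2$ they nearly cover the whole range, which is the undecidability stated above.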


This coherent result (radically different from the one based on Dempster-Shafer reasoning but starting
with exactly the same available information) comes from the hybrid DSm fusion rule, which transfers some
parts of the mass of the empty set $m(\emptyset) = (1 - \epsilon_1)(1 - \epsilon_2)\epsilon_3 + (1 - \epsilon_1)(1 - \epsilon_2)(1 - \epsilon_3) \approx 1$ onto the propositions
$(b \cap f) \cup p$ and $(p \cap \bar{f}) \cup (b \cap f) \cup (p \cap b)$.


It is clear, however, that the high value of $m(\emptyset)$ in this TP2 indicates a highly conflicting fusion problem,
which proves that the TP2 is truly an almost impossible problem, and the fusion result based on DSmT
reasoning allows us to conclude on the true undecidability of the ability of Tweety to fly or not to
fly. In other words, the fusion based on DSmT can be applied adequately to this almost impossible
problem and concludes correctly on its undecidability. Another simplistic solution would consist in saying
naturally that the problem has to be considered as an impossible one just because $m(\emptyset) > 0.5$.
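The size of the conflicting mass is easy to quantify. The Python sketch below evaluates the two transferred parts of $m(\emptyset)$ given above and their sum, which simplifies to $(1-\epsilon_1)(1-\epsilon_2)$ whatever the value of $\epsilon_3$; the function name `conflict_mass` and the sample values are assumptions of this illustration.

```python
def conflict_mass(e1, e2, e3):
    """Return the two parts of m(empty set) transferred by the hybrid
    DSm rule, and their sum."""
    to_h1 = (1 - e1) * (1 - e2) * e3         # moved onto (b & f) | p
    to_h2 = (1 - e1) * (1 - e2) * (1 - e3)   # moved onto (p & nf) | (b & f) | (p & b)
    return to_h1, to_h2, to_h1 + to_h2


if __name__ == "__main__":
    for e3 in (0.0, 0.2, 0.9):
        print(e3, conflict_mass(0.01, 0.01, e3))
```

For small $\epsilon_1$ and $\epsilon_2$ the total stays close to 1 and therefore well above the threshold 0.5 mentioned above.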


12.6 Conclusion 


In this chapter we have proposed a deep analysis of the challenging Tweety Penguin Triangle Problem.
The analysis proves that the Bayesian reasoning cannot be mathematically justified to characterize the
problem because the probabilistic model doesn't hold, even when the principle of indifference and the
conditional independence assumption are accepted. Any conclusion drawn from such a representation
of the problem, based on a hypothetical probabilistic model, actually relies on a fallacious
Bayesian reasoning. This is a fundamental result. We have then shown how the Dempster-Shafer reasoning
manages, in what we feel is a wrong way, the uncertainty and the conflict in this problem. Finally, we
proved that the DSmT can deal properly with this problem and provides a well-founded and reasonable
conclusion about the undecidability of its solution.